Approximating Hierarchical MV-sets for Hierarchical Clustering
Authors
Abstract
The goal of hierarchical clustering is to construct a cluster tree, which can be viewed as the modal structure of a density. For this purpose, we use a convex optimization program that can efficiently estimate a family of hierarchical dense sets in high-dimensional distributions. We further extend existing graph-based methods to approximate the cluster tree of a distribution. By avoiding direct density estimation, our method is able to handle high-dimensional data more efficiently than existing density-based approaches. We present empirical results that demonstrate the superiority of our method over existing ones.
Similar resources
Incidence of missing values in hierarchical clustering of microarrays data
Microarrays make it possible to determine which genes are expressed in a given cell type, at a given time, and under particular experimental conditions. These experiments are performed on a huge scale and produce large amounts of data. Methods like Hierarchical Clustering (HC) [1] or Self Organizing Map [2] are often used to identify co-expressed genes. The data often contain several missing values (MV) due to experimental ...
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When dealing with huge volumes of data, analysis is difficult or sometimes impossible. In big-data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but it gives no way to select the number of clusters and the number of members ...
Comparison of Hierarchical and Non-Hierarchical Clustering Results for Proteins Associated with Esophageal, Gastric, and Colon Cancers Based on Gene Ontology Annotation Similarities
Background and Objective: The use of proteomic methodologies and the advent of high-throughput (HTP) investigation of proteins have created a need for new approaches to the bioinformatic analysis of experimental results. Cluster analysis is a statistical procedure well suited to analyzing these data sets. Materials and Methods: In this research study, the identified proteins associated wi...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has attracted many researchers to the field of spatio-temporal trajectory clustering. Processing and analyzing these trajectories has resulted in the extraction of useful information whic...
Bayesian Hierarchical Cross-Clustering
Most clustering algorithms assume that all dimensions of the data can be described by a single structure. Cross-clustering (or multiview clustering) allows multiple structures, each applying to a subset of the dimensions. We present a novel approach to cross-clustering, based on approximating the solution to a Cross Dirichlet Process mixture (CDPM) model [Shafto et al., 2006, Mansinghka et al., ...